ABSTRACT

Black hole masses are tightly correlated with the stellar velocity dispersions of the bulges which 
surround them, and slightly less-well correlated with the bulge luminosity. It is common to use 
these correlations to estimate the expected abundance of massive black holes. This is usually done 
by starting from an observed distribution of velocity dispersions or luminosities and then changing 
variables. This procedure neglects the fact that there is intrinsic scatter in these black hole mass- 
observable correlations. Accounting for this scatter results in estimates of black hole abundances 
which are larger by almost an order of magnitude at masses > 10^9 M⊙. Including this scatter is
particularly important for models which seek to infer quasar lifetimes and duty cycles from the local
black hole mass function. However, even when scatter has been accounted for, the M•-σ relation
predicts fewer massive black holes than does the M•-L relation. This is because the σ-L relation
in the black hole samples currently available is inconsistent with that in the SDSS sample from which
the distributions of L or σ are drawn: the black hole samples have smaller L for a given σ, or larger σ
for a given L. The σ-L relation in the black hole samples is similarly discrepant with that in other
samples of nearby early-type galaxies. This suggests that current black hole samples are biased: if
this is a selection effect rather than a physical one, then the M•-σ and M•-L relations currently in the
literature are also biased from their true values.

Subject headings: galaxies: elliptical — galaxies: fundamental parameters — black hole physics 



1. INTRODUCTION 

The abundance of supermassive black holes is the sub- 
ject of considerable current interest (e.g. Yu & Tremaine 
2002; Marconi et al. 2004; McLure & Dunlop 2004; 
Shankar et al. 2004; Yu & Lu 2004; Ferrarese & Ford 
2005). Several groups have noted that galaxy formation
and supermassive black hole growth should be linked,
and many have modeled the joint cosmological evolution
of quasars and galaxies (see, e.g., Monaco et al.
2000; Kauffmann & Haehnelt 2001; Granato et al. 2001;
Cavaliere & Vittorini 2002; Cattaneo & Bernardi 2003;
Haiman et al. 2004; Hopkins et al. 2006; Lapi et al. 2006;
Haiman et al. 2006 and references therein). Since the
number of black hole detections to date is fewer than fifty,
their abundance is estimated by using secondary indicators.
In particular, M• is observed to correlate strongly
and tightly with the velocity dispersion σ of the surrounding
bulge (e.g. Ferrarese & Merritt 2000; Gebhardt et al.
2000; Tremaine et al. 2002). Since detecting bulges is
considerably easier than detecting black holes, it has become
common to estimate the abundance of black holes
by combining the observed distribution of bulge velocity
dispersions (e.g. Sheth et al. 2003) with the observed
M•-σ relation. A crude estimate follows easily if one
is willing to assume that all bulges host black holes, and
that the M•-σ relation has no intrinsic scatter (e.g. Yu
& Tremaine 2002; Aller & Richstone 2002).

There is some discussion in the literature about
whether L or σ is a better predictor of M•. There
are two parts to this statement which are not always
stated explicitly. The first is the assumption that the
M•-observable relation is a single power law; whether
this is a better approximation for L than for σ is an
open question, although Lauer et al. (2007) argue that
the curvature in the σ-L relation for massive early-type
galaxies (Bernardi et al. 2007a; Bernardi 2007) suggests
that the M•-σ relation is unlikely to be a single power
law. In what follows, we will assume the relations in
question are indeed single power laws.

The second is the issue of the scatter around the mean 
relations. It is generally believed that the relation with 
smaller scatter provides the better estimate of the M•
distribution. Indeed, Marconi et al. (2004) state that
if the scatter around two relations is similar, then both
relations should provide equivalent descriptions of the
distribution of M•. One of the goals of the present paper
is to show that this is not the whole story. Provided the 
intrinsic scatter around the two relations is accurately 
known, whether or not one relation is tighter than an- 
other is irrelevant. (The only practical difference is that, 
if the intrinsic scatter is smaller, then observations of 
fewer objects are required to estimate it reliably.) 

Both the M•-σ and M•-L relations show considerable
scatter, not all of which can be accounted for by
measurement errors. Marconi & Hunt (2003) present evidence
that the amount by which an object scatters from
these relations is correlated with bulge size (half-light
radius), suggesting that at least some component of the
scatter is intrinsic. Gebhardt et al. (2000) suggest that
the intrinsic scatter in M• at fixed velocity dispersion is
of order 0.25 dex, whereas the scatter around M•-L_V is
about 0.35 dex (e.g. Novak et al. 2006). If the intrinsic
scatter is indeed this large, then it must be accounted for,
especially when estimating the abundances of the most
massive black holes (M• > 10^9 M⊙).

Section 2 describes a toy model of the effects of scatter
which shows that (i) if intrinsic scatter is ignored, then
both the L- and σ-based predictions will underestimate
the true abundance of the most massive black holes; (ii)
the observable which correlates most tightly with M• will
provide the best estimate of the true abundance of the
most massive black holes; (iii) if scatter has been correctly
accounted for, σ- and L-based predictors of M•
abundances should give the same answer. It then shows
the M•-σ, M•-L and σ-L correlations, their scatter,
and how we use them to estimate black hole abundances.
A direct comparison of the luminosity- and velocity-dispersion-based
predictors is provided, both when intrinsic
scatter in these relations is accounted for and when it is
ignored.

We find that, if scatter is ignored, then the L-based
method predicts substantially more 10^9 M⊙ objects than
does the σ-based method. The toy model suggests that
this may be a consequence of ignoring the intrinsic scatter.
However, accounting for this scatter does not eliminate
this discrepancy, suggesting that there may be a
more serious inconsistency. Section 3 traces this
discrepancy to the fact that the σ-L
relation in the SDSS/Bernardi et al. (2003a) sample (hereafter
SDSS-B07; see Bernardi 2007 for the definition of the
SDSS-B07 sample), from which the L and σ distributions
are drawn, is rather different from that in the black hole
samples, from which the M•-σ and M•-L relations
are derived. A more detailed analysis of the role of selection
effects in the M• sample is presented in Bernardi
et al. (2007b). A final section discusses our findings and
summarizes our conclusions. A standard flat ΛCDM cosmological
model has been used, with Ω_m = 0.3, Ω_Λ = 0.7
and H_0 = 70 km s^-1 Mpc^-1.

2. BLACK HOLE ABUNDANCES FROM 
M•-OBSERVABLE CORRELATIONS

The first part of this section discusses the effect of intrinsic
scatter in M•-observable relations on inferences
about black hole abundances. The second part shows
various M•-observable correlations in the compilation of
Haring & Rix (2004). The third and fourth parts of this
section show the predicted black hole abundances when
intrinsic scatter in these relations is accounted for and
when it is not.

A detailed discussion of exactly how the black hole
sample was compiled, as well as how we convert from B,
V, R and I-band luminosities to the SDSS r band, is provided
in Appendix A of Bernardi et al. (2007b). Briefly,
all luminosities and black hole mass estimates depend on
distance: where necessary, these were computed by scaling
results in the literature to H_0 = 70 km s^-1 Mpc^-1.
The estimated velocity dispersions are, essentially, distance
independent (see Bernardi et al. 2007b for details).

2.1. A simple model of the effect of intrinsic scatter 

Consider three observables which we will call L, V and
M•, with joint distribution p(L, V, M•). To make the
discussion more concrete, suppose that this joint distribution
is Gaussian, so that it is completely
specified by the means and variances of the three variables,
and the three correlation coefficients r_VM•,
r_LM•, and r_LV. These correlation coefficients are
constrained to lie between -1 and +1, with a value of zero indicating
no correlation. Then the distribution of M• at fixed O,
with O = L or V, is Gaussian with mean and variance

$$\langle M_\bullet | O \rangle = \langle M_\bullet \rangle + r_{OM_\bullet}\,\sigma_{M_\bullet}\,\frac{O - \langle O \rangle}{\sigma_O}, \qquad (1)$$

$$\sigma^2_{M_\bullet|O} = \sigma^2_{M_\bullet}\,\bigl(1 - r^2_{OM_\bullet}\bigr). \qquad (2)$$

Let p_O(M•) denote the result of predicting the distribution
of M• from the distribution of O by using ⟨M•|O⟩
to change variables from p(O) dO to p_O(M•) dM•. Then
p_O(M•) is a Gaussian centered on ⟨M•⟩ with rms =
|r_OM•| σ_M•. Unless r_OM• = ±1, this value will be smaller
than σ_M•. Thus, in general, (i) p_V(M•) ≠ p_L(M•) and
(ii) both will be more sharply peaked than the true p(M•)
distribution. (In the limit r_OM• → 0, i.e., the limit of no
correlation between O and M•, p_O(M•) becomes a delta
function centered on the mean value; this behaviour is
the basis for the concept of 'shrinkage towards the mean'
which is common in discussions of Bayesian statistical
inference.) Hence, except in the case of perfect correlation
between M• and O, all choices of O are biased; there is
little reason to prefer the estimate from one observable
over another.
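
The shrinkage described above can be checked numerically. The following Python sketch assumes NumPy; the function name, the zero means, and the values σ_M• = 0.5 and σ_O = 1 are illustrative assumptions rather than quantities from this paper. It draws a bivariate Gaussian sample of (O, M•) and maps O onto M• through the mean relation of equation (1) alone.

```python
import numpy as np

def predicted_and_true_rms(r, sigma_m=0.5, sigma_o=1.0, n=200_000, seed=0):
    """Return (rms of p_O(M), rms of the true p(M)) for correlation r.

    All numerical values are illustrative assumptions; both means are zero.
    """
    rng = np.random.default_rng(seed)
    cov = [[sigma_o**2, r * sigma_o * sigma_m],
           [r * sigma_o * sigma_m, sigma_m**2]]
    # Each sample is a pair (O, M) drawn from the joint Gaussian.
    o, m = rng.multivariate_normal([0.0, 0.0], cov, size=n).T
    # Change of variables through the mean relation <M|O> only (no scatter).
    m_pred = r * sigma_m * o / sigma_o
    return m_pred.std(), m.std()

if __name__ == "__main__":
    for r in (0.9, 0.6):
        pred, true = predicted_and_true_rms(r)
        print(f"r = {r}: predicted rms = {pred:.3f}, true rms = {true:.3f}")
```

In this sketch the predicted rms should come out close to |r| σ_M•, and so below the true rms, with the shortfall growing as the correlation weakens.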

On the other hand, although both p_V(M•) and p_L(M•)
will underestimate the true distribution p(M•) at large
M•, the discussion above shows that the distribution predicted from
the observable which correlates more strongly with M•
will be closer to the true p(M•). In particular, at large
M•, the cumulative distribution predicted from the observable which
correlates more strongly with M• will be closer to the
true p(> M•). So one might argue that the observable
which predicts the largest p_O(M•) at the largest M• is
the one which is closest to yielding the true value. (Of
course, this is only true in an ideal world in which there
are no systematic measurement errors.)

In effect, the procedure just described ignores the scatter
around the mean ⟨M•|O⟩ relation. To include the
effects of this scatter one must convolve φ(O) with the
distribution p(M•|O) which has mean ⟨M•|O⟩ and rms
σ_M•|O:

$$\phi(M_\bullet) = \int dO\,\phi(O)\,p(M_\bullet|O). \qquad (3)$$

Provided ⟨M•|O⟩ and σ_M•|O are accurately known, it
doesn't matter what O is, or how tightly correlated it is
with M•. That is to say, predicting the distribution of
M• from L using the expression above should give the
same (correct) answer as predicting it from V.
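
A minimal numerical check of this statement, for the Gaussian toy model only, is sketched below in Python with NumPy. The grid sizes, the zero means, and the values σ_M• = 0.5 and σ_O = 1 are assumptions made for illustration. The function evaluates equation (3) on a grid, using the conditional mean and rms of equations (1) and (2), and returns the rms of the resulting distribution.

```python
import numpy as np

def gaussian(x, mu, sig):
    """Normalized Gaussian density."""
    return np.exp(-0.5 * ((x - mu) / sig) ** 2) / (sig * np.sqrt(2.0 * np.pi))

def convolved_rms(r, sigma_m=0.5, sigma_o=1.0, n_grid=801):
    """rms of phi(M) from equation (3) for a Gaussian phi(O); means are zero."""
    o = np.linspace(-8.0, 8.0, n_grid) * sigma_o
    m = np.linspace(-8.0, 8.0, n_grid) * sigma_m
    d_o, d_m = o[1] - o[0], m[1] - m[0]
    mean_m_given_o = r * sigma_m * o / sigma_o          # equation (1)
    rms_m_given_o = sigma_m * np.sqrt(1.0 - r**2)       # equation (2)
    # Integrate phi(O) p(M|O) over O for every M on the grid.
    integrand = gaussian(o, 0.0, sigma_o)[None, :] * gaussian(
        m[:, None], mean_m_given_o[None, :], rms_m_given_o)
    phi_m = integrand.sum(axis=1) * d_o
    variance = (m**2 * phi_m).sum() * d_m / (phi_m.sum() * d_m)
    return np.sqrt(variance)

if __name__ == "__main__":
    for r in (0.9, 0.6, 0.3):
        print(f"r = {r}: rms of phi(M) = {convolved_rms(r):.4f}")
```

Under these assumptions the recovered rms should equal σ_M• whatever the value of r, which is the sense in which the choice of observable does not matter once the scatter is included.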

If this does not happen, i.e., if setting O = L
gives a different answer than O = V, then this is an indication
that one or more of the p(M•|O) relations are
incorrect. This may happen, for instance, if φ(L) and
φ(V) are estimated from a dataset different from that in which
the M•-L and M•-V correlations are estimated, since,
if the datasets are not the same, then there is no guarantee
that the joint M•-L-V distributions in the two
datasets are the same. We argue below (see Section 3)
that this appears to be the case: the V-L correlation
defined by the black hole samples in the literature differs
from that in the SDSS-B07, which currently offers
the best determinations of φ(L), φ(V) and perhaps also
V-L (see Bernardi et al. 2007b).

Fig. 1. Distribution of black hole masses predicted by combining
the M•-L relation (equation A9) with the SDSS luminosity
function and ignoring (dashed) or including (solid) the effect of
0.33 dex scatter around the mean relation. In each case, the bottom
curve uses the SDSS luminosity function of Blanton et al. (2003),
and the top curve uses Blanton et al. augmented with the BCG
luminosities of Hyde et al. (2007).

2.2. The ⟨M•|L⟩ and ⟨M•|σ⟩ relations

The discussion above makes clear that, if O is to predict
M•, then the correlation of interest is ⟨M•|O⟩. Use
of the (inverse of the slope of the) ⟨O|M•⟩ correlation for
this purpose is clearly incorrect. For similar reasons, it is
logically inconsistent to use fits to the M•-O correlation
which treat M• and O symmetrically, such as bisector or
orthogonal fits. Thus, although it is commonly used, the
M•-σ relation reported by Tremaine et al. (2002) is
not the appropriate choice for this problem.

Therefore, we have performed our own fits to the re- 
lations we require. The fitting procedure we use is de- 
scribed in the Appendix, as are the results of fits to the 
Haring & Rix (2004) compilation. 

2.3. Effect of scatter in the M•-L relation

To estimate φ(M•) we need both p(log M•|L) and the
distribution of L. We use the r-band SDSS luminosity
function (Blanton et al. 2003) as our basic function,
augmented so that it includes a better estimate of the
light from the most luminous galaxies.

Briefly, the SDSS photometric pipeline tends to under- 
estimate the luminosities of bright galaxies in crowded 
fields and of nearby bright galaxies by more than 0.5 
mag (Bernardi et al. 2007a; Lauer et al. 2007; Hyde 
et al. 2007). The magnitudes of the main galaxy sam- 
ple are biased low by ~ 0.1 mag (see Bernardi 2007 for 
a discussion of the systematics in the magnitudes and 
velocity dispersion in the SDSS database and compar- 
isons with the Bernardi et al. 2003a sample). Since 
these bright galaxies are likely to be massive galaxies, 
they are likely to host massive black holes, so it is im- 
portant to correct for this bias. However, doing so is 
complicated by the fact that the light profiles of these 
objects are not standard. Hyde et al. (2007) believe 
that the light profiles are the sum of two components (a 
galaxy plus inter-cluster light), and only assign the light 
from the inner component to the object. (Assigning all 
of the integrated surface brightness to the galaxy makes 
the discrepancies described below even larger.) 



Fig. 2. Luminosity and velocity dispersion-based predictions
for the distribution of black hole masses. Curves labeled Sheth
et al. were obtained by combining the ⟨M•|σ⟩ relation of equation
(A5) with the observed distribution of velocity dispersions
(from Sheth et al. 2003). Curves labeled Blanton+Hyde were obtained
by combining the ⟨M•|L⟩ relation of equation (A9) with
the observed distribution of luminosity from Blanton et al. (2003)
and Hyde et al. (2007). The dashed curves assume there is no
intrinsic scatter around the ⟨M•|observable⟩ relations, whereas the
hashed regions are bounded by curves in which the intrinsic scatter
around the relation was assumed to be 0.22 ± 0.06 dex for ⟨M•|σ⟩,
and 0.33 ± 0.08 dex for ⟨M•|L⟩ (see Appendix).

The effect of adding these objects to the luminosity
function, and then transforming to a distribution of black
hole masses using the ⟨M•|L⟩ relation, is shown by the dashed
lines in Figure 1. The effect at the luminous end is dramatic:
Blanton + Hyde exceeds Blanton alone by many
orders of magnitude.

These estimates of black hole abundances ignore the
effects of intrinsic scatter in the M•-L relation. The
solid curves in Figure 1 show the result of transforming
to a distribution of black hole masses using equation (A9)
and accounting for scatter of 0.33 dex using equation (3).
Including the scatter increases the expected φ(M•) noticeably
at M• > 10^8.5 M⊙; by M• > 10^9.5 M⊙ ignoring the
scatter results in an underestimate of more than an order
of magnitude. In fact, Blanton + scatter exceeds Blanton
+ Hyde at almost all M•. In this respect, accounting for
scatter is more important than is getting details of the
light profile correct.
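
The qualitative effect of the scatter at the massive end can be reproduced with a short Python sketch (assuming NumPy). Every number in it is a placeholder chosen for illustration: the Schechter parameters are not the Blanton et al. (2003) fit, and the zero-point and pivot of the mean relation are hypothetical rather than the fitted values of the Appendix. Only the slope of -0.5 per magnitude (M• ∝ L^1.25) and the 0.33 dex scatter echo values quoted in the text.

```python
import numpy as np

LN10 = np.log(10.0)

def schechter_per_mag(mag, phi_star=1.0, mag_star=-21.0, alpha=-1.0):
    """Schechter function per unit magnitude (placeholder parameters)."""
    x = 10.0 ** (-0.4 * (mag - mag_star))
    return 0.4 * LN10 * phi_star * x ** (alpha + 1.0) * np.exp(-x)

def phi_log_mbh(log_m, scatter, zero_point=8.7, slope=-0.5, mag_pivot=-22.0):
    """Abundance per dex of M_bh for <log M|mag> = zero_point + slope (mag - mag_pivot).

    zero_point and mag_pivot are hypothetical; scatter is the rms in dex.
    """
    if scatter == 0.0:
        # Pure change of variables: invert the mean relation.
        mag = mag_pivot + (log_m - zero_point) / slope
        return schechter_per_mag(mag) / abs(slope)
    # Equation (3): integrate phi(mag) times a Gaussian p(log M | mag).
    mag = np.linspace(-28.0, -12.0, 8001)
    d_mag = mag[1] - mag[0]
    mean = zero_point + slope * (mag - mag_pivot)
    kernel = np.exp(-0.5 * ((log_m - mean) / scatter) ** 2) / (
        scatter * np.sqrt(2.0 * np.pi))
    return np.sum(schechter_per_mag(mag) * kernel) * d_mag

if __name__ == "__main__":
    for log_m in (7.5, 8.5, 9.5):
        ratio = phi_log_mbh(log_m, 0.33) / phi_log_mbh(log_m, 0.0)
        print(f"log M = {log_m}: with-scatter / no-scatter = {ratio:.2f}")
```

With these placeholder inputs the two estimates nearly coincide at low masses, where the luminosity function is flat, and diverge strongly on the exponential tail; the exact factor depends on the assumed parameters.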

2.4. Abundances from the correlation with σ

Figure 2 shows the results of repeating this analysis,
but now with ⟨log M• | log σ⟩ and the distribution of velocity
dispersions reported by Sheth et al. (2003). (A
word on this choice is necessary, since Bernardi et al.
2006 note that there may be more systems in the SDSS
with σ > 400 km s^-1 than the Sheth et al. fitting formula
yields. However, HST imaging shows that most of
the abnormally large σ objects in Bernardi et al. (2006)
are objects in superposition; the shape of the Sheth et
al. velocity function does not need to be augmented by
more systems at σ > 400 km s^-1.) For ease of comparison
with the luminosity function results shown in the previous
subsection, we have used the dφ(σ)/dσ shown in the final
figure of Sheth et al., which adds an estimate of the contribution
of spiral bulges to the measured distribution
of early-type galaxy velocity dispersions. Note that this
makes essentially no difference at the massive end.

Fig. 3. Accounting for the difference between bulge and total
luminosity brings the L-based estimate of black hole abundances
into better agreement with that based on σ, although the differences
at M• > 10^9 M⊙ remain.

The lowest dashed line in the figure shows the expected
abundance of supermassive black holes if one ignores the
intrinsic scatter in the ⟨log M• | log σ⟩ relation, and the
lower hashed region shows the predicted range if this
scatter is between 0.16 and 0.28 dex (i.e. 0.22 ± 0.06 dex,
see Appendix). The scatter clearly increases the expected
numbers of massive black holes significantly. To
appreciate the magnitude of the effect, the upper set of
curves shows the expected abundances based on Blanton
+ Hyde combined with the ⟨M•|L_r⟩ relation of equation
(A9) without scatter (upper dashed curve) and with
scatter (upper hashed region) between 0.25 and 0.41 dex
(i.e. 0.33 ± 0.08 dex, see Appendix). Notice that the
σ-based prediction when scatter is included is similar to
the L-based prediction when scatter is ignored.

There is a small inconsistency here which we have investigated
but which does not affect our main conclusion.
Namely, L in the M•-L relations reported earlier
refers to the bulge luminosity. Whereas the bulge
accounts for all the luminosity at large L, it accounts
for a decreasing fraction at lower L. We have found
that a crude model which sets L_bulge = f(L) L, with
f(L) = (L/L*)/(1 + L/L*), yields bulge luminosity densities
which are 40% of the total luminosity density in
the g and r bands, in good agreement with current estimates.
Figure 3 shows the result of incorporating this
model for f(L) into our estimates of φ(M•). Doing so
brings the L- and σ-based estimates into good agreement
at M• < 10^7.5 M⊙. However, since f(L) → 1 at large L,
the large differences at M• > 10^9 M⊙ remain.
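
The bulge-fraction model is simple enough to evaluate directly. The Python sketch below (assuming NumPy) computes f(L) and the bulge share of the total luminosity density for a pure Schechter function. The faint-end slope α = -1 is an illustrative assumption, not the SDSS fit, and the function names are hypothetical.

```python
import numpy as np

def bulge_fraction(x):
    """f(L) = (L/L*) / (1 + L/L*), with x = L/L*."""
    return x / (1.0 + x)

def bulge_density_fraction(alpha=-1.0):
    """Bulge share of the luminosity density for phi(x) ~ x^alpha exp(-x)."""
    x = np.logspace(-8.0, 2.5, 40001)
    ln_x = np.log(x)
    # Luminosity-weighted Schechter function; the last factor of x
    # converts dx into d(ln x) for integration on a logarithmic grid.
    total = x * x**alpha * np.exp(-x) * x
    bulge = total * bulge_fraction(x)

    def trapezoid(y):
        return np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(ln_x))

    return trapezoid(bulge) / trapezoid(total)

if __name__ == "__main__":
    print("f(L*) =", bulge_fraction(1.0))
    print("bulge share of luminosity density:", round(bulge_density_fraction(), 3))
```

For α = -1 the ratio reduces to the integral of x e^-x / (1 + x), which is close to 0.40 and so in line with the 40% quoted above; other slopes shift the value somewhat.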

3. PROBLEMS AND INCONSISTENCIES 

The smaller intrinsic scatter around ⟨M•|σ⟩ as compared
to ⟨M•|L⟩ (equations A5 and A9) suggests that
r_VM• > r_LM• (equation 2), so p_V(M•) should predict
more massive black holes than p_L(M•). Figure 2 shows
the opposite trend: the estimate based on the Blanton
et al. (2003) luminosity function is well in excess of that
based on the Sheth et al. (2003) velocity dispersion function.
This is true even before adjusting the Blanton et
al. function upwards at large L to account for BCGs.
This indicates that something has gone wrong with the
logic of the previous section.

Furthermore, the analysis of the previous section suggested
that, once scatter has been accounted for, both L-
and σ-based methods should give the same prediction.
Figures 2 and 3 show that the luminosity-based predictions
are still much larger than those based on velocity
dispersion. In this respect, our findings differ markedly
from those of McLure & Dunlop (2004), Shankar et al.
(2004) and Marconi et al. (2004) who reported that,
once scatter had been included, the two estimates agree.
As we discuss below, this is because they made different
choices for the shape and scatter of the M•-observable
correlations. Whereas McLure & Dunlop, and Shankar et
al. have approximately the same slope as equation (A9),
they are shifted to smaller zero-points. Marconi et al.
have a shallower slope for M•-L, the zero-point of their
M•-σ is larger, and they make a different choice for the
scatter.

3.1. Comparison with previous work 

The left hand panel of Figure 4 compares various determinations
of ⟨M•|L⟩. (In this figure we also show data
from Kormendy & Gebhardt (2001) and Ferrarese & Ford
(2005), although we only use their measurements of the
objects which are in common with Haring & Rix; see
Appendix A in Bernardi et al. 2007b for a description of
how the black hole sample was compiled. We use these
other measurements primarily to demonstrate the uncertainties
on the measurements: all the fits we show and
use come only from the Haring & Rix data.)

Both the McLure & Dunlop (2004) and Shankar et al.
(2004) results are based on the determination of M•-L
by McLure & Dunlop (2002): log M• = -0.5 M_R - 2.91.
McLure & Dunlop (2002) say that this determination
assumes H_0 = 50 km s^-1 Mpc^-1. Shankar et al.
(2004) say that the result of shifting this relation to
H_0 = 70 km s^-1 Mpc^-1 is to change the zero point from
-2.91 to -2.69. This results from rescaling both the luminosities
and the black hole masses: the net shift is
1.25 log(70/50)^2 - log(70/50) = 0.22, the first term coming
from shifting the luminosities, and the second from
the masses. Note that this rescaling would be appropriate
if both M• and L in the 2002 paper assumed the same
H_0, but would be inappropriate if not.

Shankar et al.'s (2004) shift differs slightly from that
of McLure & Dunlop (2004) who state that, if H_0 =
70 km s^-1 Mpc^-1 and R - K = 2.7, then their fit from
2002 implies log M• = 1.25 log(L_K/L_⊙K) - 5.76. Now,
M_⊙K = 3.28, so the right hand side of this relation is
-0.5 (M_R - 2.7 - 3.28) - 5.76 = -0.5 M_R - 2.77. This
relation predicts M• that are lower by 0.08 dex than does
the relation used by Shankar et al. (2004). The source of
this discrepancy is unclear, but it is a curious coincidence
that log(70/50) = 0.146 is close to the 2.91 - 2.77 shift
that McLure & Dunlop require: this would be the shift
if M• ∝ L, rather than ∝ L^1.25.
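
The zero-point arithmetic of the last two paragraphs can be verified in a few lines of Python. The function names are hypothetical; the numbers are those quoted in the text.

```python
import math

def shankar_2004_zero_point(zp_2002=-2.91, h_old=50.0, h_new=70.0):
    """Zero point after rescaling both L and M_bh from h_old to h_new."""
    dlog_h = math.log10(h_new / h_old)
    # First term: shift from the luminosities, 1.25 log (h_new/h_old)^2.
    # Second term: shift from the masses, log (h_new/h_old).
    return zp_2002 + 1.25 * 2.0 * dlog_h - dlog_h

def mclure_dunlop_2004_zero_point(r_minus_k=2.7, m_sun_k=3.28, zp_k=-5.76):
    """Zero point of log M_bh = -0.5 M_R + zp implied by the K-band form."""
    # log(L_K/L_sunK) = -0.4 (M_K - M_sunK), with M_K = M_R - (R - K),
    # so 1.25 log(L_K/L_sunK) = -0.5 M_R + 0.5 (R - K + M_sunK).
    return 0.5 * (r_minus_k + m_sun_k) + zp_k

if __name__ == "__main__":
    zp_s = shankar_2004_zero_point()
    zp_m = mclure_dunlop_2004_zero_point()
    print(f"Shankar et al.: {zp_s:.2f}")
    print(f"McLure & Dunlop: {zp_m:.2f}")
    print(f"difference: {zp_s - zp_m:.2f} dex")
```

The two zero points come out at about -2.69 and -2.77, reproducing the 0.08 dex difference noted above.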

Fig. 4. Left: Correlation between M• and bulge luminosity. Symbols show measurements from a variety of data sets, solid line shows
the fit reported in equation (A9), and dashed line shows the fit from McLure & Dunlop (2002) once the difference in Hubble constant has
been accounted for. Dotted, dot-dashed and long-dashed lines show the fits used by McLure & Dunlop (2004), Shankar et al. (2004) and
Marconi et al. (2004), respectively. In all cases, the fits and data have been shifted to the r band (using B - r = 1.25, V - r = 0.34,
r - R = 0.27, r - I = 1.07 and r - K = 2.7). Right: Correlation between M• and velocity dispersion. The solid line shows the fit reported
in equation (A5). Dot-dashed and long-dashed lines show the fits given in Tremaine et al. (2002) and Marconi et al. (2004), respectively.

Fig. 5. Black hole abundances associated with some of the
M•-L relations shown in the previous figure. All curves assume
the r-band luminosity function from Blanton et al. (2003) except
for Marconi et al. (2004) which is based on the early-type galaxy
sample of Bernardi et al. (2003a). Clearly, the fit which produces
larger black hole masses for a given luminosity results in the most
supermassive black holes. The hashed region labeled Sheth et al.
was obtained by combining the ⟨M•|σ⟩ relation of equation (A5)
with the observed distribution of velocity dispersions (from Sheth
et al. 2003). This region is bounded by curves in which the intrinsic
scatter around the relation ⟨M•|σ⟩ was assumed to be 0.16 and
0.28 dex.

The McLure & Dunlop (2004) and Shankar et al.
(2004) relations are shown as the dotted and dot-dashed
lines in Figure 4. Both have been shifted from R to
r using r - R = 0.27, and both lie below the Haring
& Rix data. To study why, we returned to the issue of
whether or not both L and M• should have been rescaled
when H_0 was changed. McLure & Dunlop (2002) report
that their M•-σ relation is essentially the same
as that of Tremaine et al. (2002). Therefore, they
must be using the same M• values as Tremaine et al.
However, the Tremaine et al. analysis actually assumes
H_0 = 80 km s^-1 Mpc^-1 rather than 50 km s^-1 Mpc^-1.
To illustrate the effect this has, suppose we rescale the
M• values in their 2002 fit by (70/80), and the R-band
luminosities by (70/50)^2. This would make their relation
log M• = -0.5 M_R - 2.91 + 2.5 log(70/50) - log(70/80) =
-0.5 M_R - 2.49; this is a shift in zero point of 0.42. Shifting
this from R to r using r - R = 0.27 as before yields the
short dashed line. It is in substantially better agreement
with the Haring & Rix data than is the dotted line.
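
The alternative rescaling, in which the luminosities are moved from H_0 = 50 and the masses from H_0 = 80 to a common H_0 = 70, is checked below. This is a small Python sketch of the expression quoted in the text; the function name and argument names are hypothetical.

```python
import math

def mixed_rescaling_zero_point(zp_2002=-2.91, h_lum_old=50.0,
                               h_mass_old=80.0, h_new=70.0):
    """Zero point when L and M_bh start from different Hubble constants."""
    lum_term = 2.5 * math.log10(h_new / h_lum_old)   # from the luminosities
    mass_term = math.log10(h_new / h_mass_old)       # from the masses
    return zp_2002 + lum_term - mass_term

if __name__ == "__main__":
    zp = mixed_rescaling_zero_point()
    print(f"zero point: {zp:.2f}, shift: {zp + 2.91:.2f}")
```

The result is a zero point of about -2.49, a shift of 0.42 relative to the 2002 fit, as stated.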

The relations used by McLure & Dunlop and Shankar
et al. clearly produce smaller black holes for a given luminosity.
The most important effect of this is to decrease
the L-based estimate of the number of objects with
M• > 10^9 M⊙. This is shown in Figure 5. The dotted,
dashed and dot-dashed curves show the result of inserting
the McLure & Dunlop (2002-2004) and Shankar et
al. (2004) based ⟨M•|L⟩ relations in equation (3), respectively,
when the scatter is assumed to be 0.33 dex. The
solid line shows φ(M•) for our fit, and the hashed region
shows the σ-based abundances. Clearly, the ⟨M•|L⟩ relation
with the smallest zero-point, that of McLure &
Dunlop (2004), produces the fewest massive black holes.
McLure & Dunlop are able to account for the small difference
which remains between the L- and σ-based estimates
by assigning a larger scatter to the M•-σ relation,
0.3 dex, rather than the 0.22 dex which we used to produce
Figure 3. However, the left hand panel in Figure 4
suggests that the lower zero-point is unacceptably low,
and 0.3 dex is larger than all recent estimates of the scatter
around ⟨M•|σ⟩.

Marconi et al. (2004) also found consistency between
the two estimates. They believe that this is because the
scatter in the M•-L and M•-σ relations is similar
(they believe both are about 0.3 dex). We believe that
it is their choice of relations combined with the scatter
around the relations which is the cause of the agreement
(the analysis of the previous section shows clearly that
having equal scatter in the two relations is neither sufficient
nor necessary). To illustrate, the long dashed line in
Figure 5 shows the M• distribution computed by Marconi
et al. (2004) from the L-based approach (at lower M• it
differs from the other works mainly because Marconi et
al. used the early-type galaxy luminosity function from
Bernardi et al. 2003b instead of the luminosity function
of all types from Blanton et al. 2003). Their σ-based
approach gives a similar curve provided one uses their
M•-σ relation, shown as the long dashed line in the
right panel of Figure 4, with intrinsic scatter of 0.3 dex.
The hashed region labeled Sheth et al. was obtained
by combining the ⟨M•|σ⟩ relation of equation (A5) with
the observed distribution of velocity dispersions (from
Sheth et al. 2003). This region is bounded by curves
in which the intrinsic scatter around the relation ⟨M•|σ⟩
was assumed to be 0.16 and 0.28 dex (note that the larger
limit is similar to the value used by Marconi et al., i.e.
0.3 dex). However, in this case the L-based (long dashed line)
and σ-based (upper bound of hashed region) estimates
are different. Marconi et al. found consistency between
the two estimates because the M•-σ relation they used is
steeper and shifted to larger M• values than our relation
(see right hand panel of Figure 4), so their values of σ produce
larger values of M•.

Although Marconi et al. (2004), McLure & Dunlop
(2004), and Shankar et al. (2004) were able to obtain
L-based estimates of φ(M•) which were in good agreement
with those based on σ, the analysis above suggests
that this was largely due to a fortuitous inconsistency resulting
from how one rescales M• and L when changing
the Hubble constant. However, in the next subsection
we discuss why, if the Hubble-constant related scalings
are all done self-consistently, then the σ- and L-based
estimates should not have given the same answer!

3.2. The σ-L relation

Why do our L- and σ-based estimates give different
answers? If we transform the SDSS-B07 luminosity distribution
into one for σ using equations (A10) and (3),
and then to a distribution of M• using equations (A5)
and (3), then this gives the same answer as transforming
SDSS-B07 luminosity into M• directly using equations
(A9) and (3). This is exactly as expected from
the toy model described in the previous section. However,
the intermediate step provides a predicted velocity
function which disagrees with the SDSS-B07 one (from
Sheth et al. 2003). Figure 6 shows this explicitly; the
hashed region shows the result of starting with the SDSS
φ(L) and using the black hole ⟨σ|L⟩ relation and scatter
(equation A10) to infer φ(σ). The range of values
comes from including the uncertainty in the slope and
scatter of ⟨σ|L⟩. The disagreement with the actual measured
φ(σ) distribution (solid curve) strongly suggests
that the σ-L relation in the black hole samples is not
the same as in the SDSS-B07 sample, and that this is the
source of the discrepancy between the L- and σ-based
estimates.

Fig. 6. Observed and predicted distribution of σ; solid curve
shows the velocity function reported by Sheth et al. (2003); dot-dashed
line shows the result of starting from the luminosity function
of Bernardi et al. (2003b), and using the SDSS ⟨σ|L⟩ relation
and its scatter to infer φ(σ), whereas hashed region uses the ⟨σ|L⟩
relation and scatter from the black hole sample of Haring & Rix
(2004) instead. The ⟨σ|L⟩ relation for the black hole sample is
rather uncertain, since it is derived from only ~30 objects: the
hashed region shows the range of predicted φ(σ) associated with
allowing the slope and scatter of the σ-L relation to vary within
one standard deviation of their rms values.



In SDSS-B07,

$$\langle \log\sigma \,|\, M_r \rangle_{\rm SDSS\text{-}B07} = 2.287 - \frac{0.255}{2.5}\,(M_r + 22), \qquad (4)$$

with an error in the slope and zero-point of 0.009 and
0.005, respectively, whereas it is

$$\langle \log\sigma \,|\, M_r \rangle = 2.42 - \frac{0.34}{2.5}\,(M_r + 22) \qquad (5)$$

in the Haring & Rix sample (equation A10). The errors
in the slope and zero-point are 0.02 and 0.01, respectively.
Note that this slope of -0.34/2.5 = -0.14 is
rather different from the canonical value of -0.10. At a
given luminosity, the black hole samples have log σ larger
by about 0.08 dex than the SDSS-B07; observational errors
are typically only about 0.03 dex.
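
To see how the two relations differ as a function of luminosity, the following Python sketch evaluates equations (4) and (5). The coefficients 0.255 and 0.34 are taken as read from those equations, and the magnitudes at which the offset is evaluated are arbitrary illustrative choices, not values singled out by the paper.

```python
def log_sigma_sdss(m_r):
    """Equation (4): <log sigma | M_r> in SDSS-B07."""
    return 2.287 - (0.255 / 2.5) * (m_r + 22.0)

def log_sigma_black_hole_sample(m_r):
    """Equation (5): <log sigma | M_r> in the Haring & Rix sample."""
    return 2.42 - (0.34 / 2.5) * (m_r + 22.0)

def log_sigma_offset(m_r):
    """Black hole sample minus SDSS-B07, in dex, at fixed M_r."""
    return log_sigma_black_hole_sample(m_r) - log_sigma_sdss(m_r)

if __name__ == "__main__":
    for m_r in (-20.5, -21.0, -22.0):  # illustrative magnitudes
        print(f"M_r = {m_r}: offset = {log_sigma_offset(m_r):.3f} dex")
```

Because the slopes differ, the offset depends on magnitude: with these coefficients it is about 0.08 dex near M_r = -20.5 and grows towards more luminous objects.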

Yu & Tremaine (2002) also considered the possibility
that the σ-L relation was the cause of the discrepancy,
and suggested that perhaps there are systematic differences
between SDSS-B07 velocity dispersions and those
derived from more local samples. A direct test of this
possibility is difficult because, of the ~30 objects in the
Haring & Rix compilation, only about ten have SDSS
imaging, and only NGC 4261 has an SDSS spectrum as
well. For the objects in common, the SDSS apparent
magnitudes are about 0.5 mag fainter than those used
in the black hole analyses, but this is almost certainly
due to the sky subtraction problems for bright objects
to which we referred earlier (Hyde et al. 2007). In any
case, correcting for this will increase the SDSS luminosities,
further exacerbating the discrepancy in the σ-L
relation.

A detailed discussion of the $\sigma-L$ relation computed
from different samples, analysis of systematic biases
which affect the samples, and the effect of correcting
"naively" the nearby samples for peculiar velocities, is
presented in Bernardi (2007). Compared to any of these
early-type galaxy samples, the $\sigma-L$ relation in black hole
samples is biased to larger $\sigma$ for given $L$, or to smaller
$L$ for a given $\sigma$ (Bernardi et al. 2007b). In view of this
discrepancy, whatever the cause, the fact that McLure
& Dunlop (2004), Marconi et al. (2004) and Shankar et
al. (2004) obtained consistent estimates of $\phi(M_\bullet)$ from
both $L$ and $\sigma$ is remarkable indeed.

Fig. 7. Predicted abundances if the luminosities of the black
hole hosts are modified so that they define a $\langle\sigma|L\rangle$ relation which
has the same slope and zero-point as the SDSS-B07 relation. This
rescaling modifies the $\langle M_\bullet|L\rangle$ relation, but leaves the $\langle M_\bullet|\sigma\rangle$
relation unchanged. As a result the curve labeled 'Sheth + bulges'
is the same as before, but 'Blanton + Hyde' now produces many
fewer massive objects.

Figure 7 shows the result of assuming that the velocity
dispersion estimates in the black hole sample are reliable,
but the distances, and so the luminosities, are not.
It was constructed by rescaling all the bulge luminosities
of the black hole hosts so that they define a relation with
the same slope as in equation (4), though with different
scatter. To do so, we added $-0.978 + 0.25\,(M_r + 22)$ to
each of the absolute magnitudes in the black hole sample,
as suggested by the difference between equations (4)
and (5).

These rescaled luminosities were used to estimate a
new $\langle M_\bullet|L_r\rangle$ relation, which was then inserted in
equation ^ to predict black hole abundances from the
luminosity function. The resulting abundances are considerably
lower, because the rescaled luminosities define a
considerably shallower $\langle M_\bullet|L_r\rangle$ relation, meaning that
considerably larger $L$ is required to reach $M_\bullet > 10^9\,M_\odot$.
While this rescaling is probably unrealistic, we have
included the result to illustrate the importance of the $\sigma-L$
relation when comparisons of the $L$- and $\sigma$-based
estimates of $\phi(M_\bullet)$ are made. A more careful accounting of
the role of selection effects is presented in Bernardi et al.
(2007b).
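
To see why the rescaling flattens the relation, one can propagate the quoted magnitude shift through the direct $\langle\log M_\bullet|M_r\rangle$ fit given in equation (9) of the Appendix. This is only an algebraic sketch: it assumes that the shift is applied to the original magnitudes exactly as quoted, it ignores scatter, and it does not refit the data, which is what the actual analysis did. The function names are hypothetical.

```python
# Propagate the quoted magnitude shift through a linear relation
#   <log M_bh | M_r> = zero_point - slope * (M_r + 22).
# ASSUMPTION: M_r' = M_r + (-0.978 + 0.25 * (M_r + 22)), i.e. the shift acts on
# the original magnitudes. Algebra only: no refit to data, no scatter.

def rescale_magnitude(m_r):
    """Apply the shift quoted in the text to an absolute magnitude."""
    return m_r + (-0.978 + 0.25 * (m_r + 22.0))


def rescaled_relation(zero_point=8.68, slope=1.30 / 2.5):
    """Return (zero_point', slope') of the same relation written in M_r'.

    With x = M_r + 22 and x' = M_r' + 22 the shift reads x' = 1.25 * x - 0.978,
    so x = (x' + 0.978) / 1.25.
    """
    new_slope = slope / 1.25
    new_zero_point = zero_point - slope * 0.978 / 1.25
    return new_zero_point, new_slope


if __name__ == "__main__":
    zp, s = rescaled_relation()
    print(f"rescaled zero-point = {zp:.3f}, slope = {2.5 * s:.2f}/2.5 per mag")
```

Under these assumptions the slope drops from 1.30/2.5 to 1.04/2.5 per magnitude, in line with the shallower relation described above.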

4. DISCUSSION 

It is common to estimate the abundance of supermassive
black holes by combining observed correlations between
$M_\bullet$ and bulge luminosity or velocity dispersion,
calibrated from relatively small samples, with luminosity
or velocity dispersion functions determined from larger
samples. However, the $\langle M_\bullet|\sigma\rangle$ and $\langle M_\bullet|L\rangle$ relations
have intrinsic scatter of about 0.22 and 0.33 dex (Appendix).
Accounting for this results in considerably increased
estimates of the abundance of black holes with
$M_\bullet > 10^9\,M_\odot$, compared to naive estimates which
ignore this scatter. Doing so is at least as important as
correcting the luminosity function for the fact that the
most luminous galaxies have non-standard light profiles
(Figure 1). Once this scatter has been accounted for,
the $\sigma$-based estimates of $\phi(M_\bullet)$ are in reasonably good
agreement with models, such as that of Hopkins et al.
(2006), which relate previous QSO and AGN activity to
the local black hole mass function. The luminosity-based
estimates, on the other hand, are substantially in excess
of this model.
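
The effect of intrinsic scatter on the high-mass tail can be illustrated with a toy calculation. The sketch below uses the $\langle\log M_\bullet|\log\sigma\rangle$ fit and scatter from the Appendix, but the velocity function is a hypothetical unit-normalized Gaussian in $\log\sigma$ whose parameters are invented for illustration; it is not the Sheth et al. (2003) fit, and the grids and function names are likewise assumptions.

```python
import numpy as np

# M_bh-sigma fit from the Appendix: <log M | log sigma> = A + B * log(sigma/200),
# with intrinsic scatter SCATTER (dex).
A, B, SCATTER = 8.21, 3.83, 0.22


def toy_phi_log_sigma(log_sigma):
    """HYPOTHETICAL velocity function: unit-normalized Gaussian in log sigma.

    The mean (2.2) and rms (0.11) are made up for illustration only.
    """
    mu, rms = 2.2, 0.11
    return np.exp(-0.5 * ((log_sigma - mu) / rms) ** 2) / (rms * np.sqrt(2 * np.pi))


def phi_log_mass(log_m, scatter):
    """phi(log M) = integral d(log sigma) phi(log sigma) p(log M | log sigma)."""
    log_m = np.atleast_1d(np.asarray(log_m, dtype=float))
    if scatter == 0.0:
        # p(M | sigma) is a delta function: plain change of variables.
        log_sigma = (log_m - A) / B + np.log10(200.0)
        return toy_phi_log_sigma(log_sigma) / B
    grid = np.linspace(1.2, 3.2, 2001)            # integration grid in log sigma
    step = grid[1] - grid[0]
    mean = A + B * (grid - np.log10(200.0))       # mean log M at each log sigma
    kernel = np.exp(-0.5 * ((log_m[:, None] - mean[None, :]) / scatter) ** 2)
    kernel /= scatter * np.sqrt(2 * np.pi)        # Gaussian p(log M | log sigma)
    return np.sum(kernel * toy_phi_log_sigma(grid)[None, :], axis=1) * step


def abundance_above(log_m_min, scatter):
    """Cumulative abundance with log M > log_m_min (toy units, Riemann sum)."""
    log_m = np.linspace(log_m_min, log_m_min + 4.0, 801)
    step = log_m[1] - log_m[0]
    return np.sum(phi_log_mass(log_m, scatter)) * step


if __name__ == "__main__":
    no_scatter = abundance_above(9.0, 0.0)
    with_scatter = abundance_above(9.0, SCATTER)
    print("n(>1e9) ratio, with/without scatter:", with_scatter / no_scatter)
```

In this toy setup the abundance above $10^9\,M_\odot$ is between two and three times larger once scatter is included. The actual factor depends on how steeply the real velocity or luminosity function falls at the bright end.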

These results follow from using a single power-law to
parametrize the $M_\bullet-\sigma$ and $M_\bullet-L$ relations. While this
may be too simplistic, this parametrization is not the primary
reason why the $L$- and $\sigma$-based approaches yield
different predictions for black hole abundances. The
main cause of the discrepancy is that the $\sigma-L$ correlation
in black hole samples is different from that in the samples
from which the luminosity and velocity functions are
drawn: compared to the ENEAR or SDSS-B07 samples, the
black hole samples have larger $\sigma$ for a given $L$, or
smaller $L$ for a given $\sigma$ (Bernardi et al. 2007b). If this
is a physical effect, then it compromises the fundamental
assumption of black hole demographic studies, namely that
all galaxies host black holes. If, on the other hand, it
is a selection effect, then the $M_\bullet-\sigma$ and $M_\bullet-L$ relations
currently in the literature are biased compared
to the true relations, making current estimates of black
hole abundances unreliable. If black hole masses correlate
with bulge luminosity only because of the $M_\bullet-\sigma$ and
$\sigma-L$ relations, then the bias in the $\sigma-L$ relation is
unimportant only if one is using $\phi(\sigma)$ to infer black hole
abundances: the $\phi(L)$-based estimate may be strongly
affected.

Identifying the source of the bias is complicated.
Residuals from the size-luminosity relation are
anti-correlated with residuals from the $\sigma-L$ relation, as
might be expected from the virial theorem (Bernardi et
al. 2003b). If the stellar kinematics method of measuring
black hole masses favors objects with high surface
brightnesses, then one might expect smaller sizes and larger $\sigma$
at constant $L$: this would produce a bias in the sense
we see. On the other hand, it might be more difficult to
measure the influence of the black hole on stellar kinematics
if the stellar velocity dispersion is already abnormally
large; this would produce a bias in the opposite
sense. Whether either of these effects has played a role
in the selection of black hole samples is an open question.
See Bernardi et al. (2007b) for further study along these
lines.
APPENDIX



This Appendix describes our procedure for estimating the slope and scatter associated with $\langle M_\bullet|O\rangle$. Let
$y \equiv \log M_\bullet - \langle\log M_\bullet\rangle$, and $x \equiv O - \langle O\rangle$, and let $\sigma_x$, $\sigma_y$ and $r_{xy}$ denote the true intrinsic rms values of $x$, $y$, and the
cross-correlation coefficient. Finally, let $\epsilon_x$ and $\epsilon_y$ denote the typical measurement errors in determining $x$ and $y$. In
practice, the estimated error may vary from object to object; in using a single representative value, our analysis below
ignores this additional information. Minimizing

$$\chi^2 = \sum_i (y_i - a\,x_i - b)^2 / N \qquad (1)$$

with respect to $a$ and $b$ yields

$$a_{\rm min} = \frac{\sum_i x_i y_i}{\sum_i x_i^2} = \frac{\sigma_x\sigma_y r_{xy}}{\sigma_x^2 + \epsilon_x^2} = \frac{\sigma_x\sigma_y r_{xy}}{\sigma_x^2\,(1 + \epsilon_x^2/\sigma_x^2)} \qquad (2)$$

and, because both $x$ and $y$ have zero mean, $b_{\rm min} = 0$. Comparison with equation |T]) shows that $a_{\rm min}$ differs from the
true slope $a_{y|x}$ because of the measurement errors $\epsilon_x^2$. (We have assumed uncorrelated measurement errors in $x$ and
$y$. Hence, these errors affect the mean of $x^2$, and of $y^2$, but not the correlation between $x$ and $y$.) The scatter around



this relation is

$$\chi^2_{\rm min} = \sigma_y^2\,(1 - r_{xy}^2) + \epsilon_y^2 + a_{\rm min}^2\,\epsilon_x^2 \left(1 + \frac{\epsilon_x^2}{\sigma_x^2}\right). \qquad (3)$$

The uncertainty on the value of this minimum is $\chi^2_{\rm min}\sqrt{2(N-1)/N^2}$: for $N = 24$ the estimated scatter is uncertain
by about $\sqrt{46}/24 \sim$ thirty percent.

Comparison with equation ^ shows that the first term in the expression above represents the intrinsic scatter
around the true relation, and the other terms are a consequence of the measurement errors. Hence, the intrinsic slope
and scatter which we report in the main text are

$$a_{y|x} = a_{\rm min}\left(1 + \frac{\epsilon_x^2}{\sigma_x^2}\right) \quad {\rm and} \quad \sigma^2_{y|x} = \chi^2_{\rm min} - \epsilon_y^2 - a_{\rm min}^2\,\epsilon_x^2\left(1 + \frac{\epsilon_x^2}{\sigma_x^2}\right). \qquad (4)$$

Notice that $a_{y|x}$ can be determined well even if $\epsilon_y$ is large; of course large uncertainties in $y$ do affect the scatter
around the mean relation. There will be trouble only if $\sigma_x \ll \epsilon_x$: in this case, the large measurement errors in $x$ have
largely erased the correlation between $x$ and $y$, so a small measured slope requires a large correction factor to restore
it to the true value.
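
The procedure above can be sketched in a few lines of code. This is a minimal illustration under stated assumptions rather than the code used for the analysis: the function names are hypothetical, a single representative error is used for all objects as in the text, and the intrinsic variance $\sigma_x^2$ is estimated as the observed variance of $x$ minus $\epsilon_x^2$, which is an assumption about how that quantity is obtained.

```python
import numpy as np


def intrinsic_slope_and_scatter(x_obs, y_obs, eps_x, eps_y):
    """Correct a least-squares slope and rms scatter for measurement errors.

    x_obs, y_obs : observed values (their means are subtracted internally)
    eps_x, eps_y : single representative measurement errors on x and y
    Returns (a_yx, sigma_yx): intrinsic slope and intrinsic rms scatter.
    """
    x = np.asarray(x_obs, dtype=float)
    y = np.asarray(y_obs, dtype=float)
    x = x - x.mean()
    y = y - y.mean()
    n = x.size

    a_min = np.sum(x * y) / np.sum(x * x)            # measured slope, b_min = 0
    chi2_min = np.sum((y - a_min * x) ** 2) / n      # measured scatter squared

    # ASSUMPTION: intrinsic variance of x = observed variance - eps_x^2.
    sigma_x2 = np.sum(x * x) / n - eps_x ** 2
    if sigma_x2 <= 0.0:
        raise ValueError("measurement errors in x exceed the observed spread")

    boost = 1.0 + eps_x ** 2 / sigma_x2
    a_yx = a_min * boost
    sigma_yx2 = chi2_min - eps_y ** 2 - a_min ** 2 * eps_x ** 2 * boost
    return a_yx, np.sqrt(max(sigma_yx2, 0.0))


def fractional_uncertainty(n):
    """Fractional uncertainty sqrt(2 (N - 1) / N^2) quoted in the text."""
    return np.sqrt(2.0 * (n - 1)) / n
```

For example, `fractional_uncertainty(24)` evaluates $\sqrt{46}/24$, the roughly thirty percent figure quoted above. The `ValueError` branch corresponds to the $\sigma_x \ll \epsilon_x$ regime, in which the correction becomes unreliable.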

We have applied this procedure to the dataset of Häring & Rix (2004), who provide estimates of $M_\bullet$, $\sigma$, $M_{\rm bulge}$ and
$L_r$ and the fraction of this luminosity which is from the bulge. (Appendix A in Bernardi et al. 2007b describes exactly
how the black hole sample was compiled and the conversion from B, V, R and I-band luminosities to SDSS r-band.
Both luminosities and the black hole masses were scaled to $H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$.) When doing so we deal
almost exclusively with logarithmic quantities; when taking the logarithm, $M_\bullet$ is in units of $M_\odot$, $\sigma$ is in km s$^{-1}$, and
the associated measurement errors are $\epsilon_{\log M_\bullet} \sim 0.2$ dex, $\epsilon_{\log\sigma} \sim 0.03$ dex, and $\epsilon_{\log M_{\rm bulge}} \sim 0.18$ dex. The scatter
values we report around the correlations are estimates of the intrinsic scatter. The uncertainties in the slope, zero-point,
and scatter of the following relations were computed by bootstrap resampling. Application of the procedure outlined
above yields

$$\langle\log M_\bullet \,|\, \log\sigma\rangle = (8.21 \pm 0.06) + (3.83 \pm 0.21)\,\log\left(\frac{\sigma}{200\ {\rm km\,s^{-1}}}\right) \qquad (5)$$

with intrinsic scatter of 0.22 ± 0.06 dex, and

$$\langle\log M_\bullet \,|\, \log M_{\rm bulge}\rangle = (8.31 \pm 0.10) + (1.06 \pm 0.12)\,\log\left(\frac{M_{\rm bulge}}{10^{11}\,M_\odot}\right) \qquad (6)$$

with rms scatter 0.33 ± 0.08 dex. Bulge mass and luminosity are tightly correlated (Häring & Rix 2004):

$$\langle\log M_{\rm bulge} \,|\, M_r\rangle = 11.35 - 0.492\,(M_r + 22) \qquad (7)$$

with negligible scatter, so inserting this fit in the previous one yields

$$\langle\log M_\bullet \,|\, M_r\rangle = 8.69 - \frac{1.30}{2.5}\,(M_r + 22) \qquad (8)$$

with scatter of 0.33 dex. As a check, we have also fit for the correlation between $M_\bullet$ and $M_r$ directly, finding

$$\langle\log M_\bullet \,|\, M_r\rangle = (8.68 \pm 0.10) - \frac{(1.30 \pm 0.15)}{2.5}\,(M_r + 22) \qquad (9)$$

with scatter of 0.34 ± 0.09 dex. The main text also considered the correlation between $L$ and $\sigma$ in this data set. It is

$$\langle\log\sigma \,|\, M_r\rangle = (2.42 \pm 0.01) - \frac{(0.34 \pm 0.02)}{2.5}\,(M_r + 22) \qquad (10)$$

with scatter of 0.04 ± 0.01 dex. Note that this slope is rather different from the canonical value of $-0.10$.
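
The substitution of equation (7) into equation (6) that leads to equation (8) can be checked with a short calculation. The sketch below assumes that the pivot mass in equation (6) is $10^{11}\,M_\odot$, a value adopted here because it reproduces the quoted zero-point; the function and parameter names are hypothetical.

```python
# Substitute <log M_bulge | M_r> (equation 7) into <log M_bh | log M_bulge>
# (equation 6) to obtain <log M_bh | M_r> = zero_point - (slope/2.5) * (M_r + 22).
# ASSUMPTION: the pivot mass in equation (6) is 1e11 M_sun (log_pivot = 11).

def compose_mass_luminosity(zp_bh=8.31, slope_bh=1.06, log_pivot=11.0,
                            zp_bulge=11.35, slope_bulge=0.492):
    """Return (zero_point, slope_times_2p5) of the composed relation."""
    zero_point = zp_bh + slope_bh * (zp_bulge - log_pivot)
    slope_per_mag = slope_bh * slope_bulge
    return zero_point, 2.5 * slope_per_mag


if __name__ == "__main__":
    zp, slope = compose_mass_luminosity()
    print(f"zero-point = {zp:.3f}, slope = {slope:.3f}/2.5 per mag")
```

Under this assumption the composed relation has a zero-point of about 8.68 and a slope of about 1.30/2.5 per magnitude, close to equation (8) and to the direct fit in equation (9).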



